add resnet50 example #3266

MoyanZitto · 2016-07-20T07:38:40Z

This is a Keras implementation of Kaiming He's residual network (50 layers).
The layers have been properly named so that it is easy for anyone to load the pretrained weights converted from Kaiming He's caffemodel file.

giorgiop · 2016-07-20T07:49:14Z

Not sure if that's still active, but #2793 is on ResNet as well.

MoyanZitto · 2016-07-20T08:46:28Z

@giorgiop Yes, I've checked the issue you mentioned, but I think these two scripts are not the same. My commit is exactly what Kaiming He published on his GitHub; actually, I have converted the pretrained caffemodel provided by Kaiming He to a Keras h5 file. Once I finish my tests, I'll update this script so that people can get a pretrained resnet50 model directly from the Keras examples.
I think this would be helpful to many researchers.
Thank you!

This script gets 10 points in PEP8 tests on my computer but.....

fchollet · 2016-07-20T17:07:50Z

> actually I have converted the pretrained caffemodel provided by Kaiming He to a Keras h5 file

In that case I think you should provide the Keras weights, and in your script demonstrate how to load the pre-trained weights and run inference on some images. It would be a great addition.

fchollet · 2016-07-20T23:13:22Z

One thing to consider would be to provide two versions of the weights file: one for Theano and one for TensorFlow, since they differ: https://github.com/fchollet/keras/wiki/Converting-convolution-kernels-from-Theano-to-TensorFlow-and-vice-versa

MoyanZitto · 2016-07-21T03:03:46Z

@fchollet Ok, I'll do it as soon as possible~

MoyanZitto · 2016-07-21T09:28:07Z

@fchollet Hey, I have finished testing this script and it works well. This is a screenshot from my IPython session:

In the comments at the top of this script, I give the address where people can download the pretrained h5 file. This weight file is only for the TensorFlow backend for now. I tried to convert it to a Theano backend version but failed (the weights could be loaded, but the test result was incorrect). Maybe you can just merge this PR and I'll keep trying.

And since gist is blocked by the Great Firewall in China, the converted weights were uploaded to the Baidu cloud drive. Maybe someone could download them from the Baidu cloud drive and then upload them to gist. I think the Chinese text on the Baidu cloud drive is more or less annoying to those who speak English.

Thank you! I'm going to fix the endless PEP8 problems...

fchollet · 2016-07-21T17:44:36Z

> This weight file is only for the TensorFlow backend for now. I tried to convert it to a Theano backend version but failed (the weights could be loaded, but the test result was incorrect). Maybe you can just merge this PR and I'll keep trying.

Can you clarify what you did and what went wrong?

The only difference between Theano and TensorFlow is the fact that TensorFlow uses flipped kernels (because it does correlation, not convolution) in Convolution2D kernels. Simply iterating over all convolution layers and flipping the kernels is enough to convert the weights file. There are no other differences.
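The relationship described above can be checked numerically. The following is a minimal NumPy sketch, not the Keras `convert_kernel` implementation: the helper names `correlate2d_valid` and `convolve2d_valid` are hypothetical and written only for this illustration, and a single 2D kernel stands in for one slice of a `Convolution2D` weight tensor. It shows that correlating with a spatially flipped kernel gives the same result as true convolution with the original kernel.

```python
import numpy as np


def correlate2d_valid(image, kernel):
    """Cross-correlation ('valid' mode): slide the kernel without flipping it."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out


def convolve2d_valid(image, kernel):
    """True convolution ('valid' mode), written out term by term."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            for a in range(kh):
                for b in range(kw):
                    out[i, j] += image[i + kh - 1 - a, j + kw - 1 - b] * kernel[a, b]
    return out


rng = np.random.RandomState(0)
image = rng.rand(5, 5)
kernel = rng.rand(3, 3)

# Flip the kernel along both spatial axes.
flipped = kernel[::-1, ::-1]

# Correlating with the flipped kernel reproduces convolution with the original.
same = np.allclose(correlate2d_valid(image, flipped),
                   convolve2d_valid(image, kernel))

# Flipping twice restores the original kernel, so the conversion is its own inverse.
roundtrip = np.array_equal(flipped[::-1, ::-1], kernel)

print(same, roundtrip)
```

Because the flip is its own inverse, the same operation converts weights in either direction, provided it is applied to the spatial axes of the kernel and not to the filter or channel axes.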

fchollet · 2016-07-22T03:48:52Z

Btw, the link you provided does not work. It shows: "Uh-oh, the page you are trying to visit no longer exists."

MoyanZitto · 2016-07-22T04:39:36Z

@fchollet Good afternoon, here's how I converted the tf weights to th weights:

from keras.utils.np_utils import convert_kernel
import h5py

# Copy every layer group from the TensorFlow weights file into a new Theano weights file.
with h5py.File('resnet50.h5', 'r') as f_tf, h5py.File('thresnet50.h5', 'w') as f_th:
    for k in f_tf.keys():
        grp = f_th.create_group(k)  # create a group for each layer
        if k.startswith('res') or k.startswith('conv'):  # the name prefix marks a conv layer
            # for a conv layer, call convert_kernel to convert the weights to th
            grp.create_dataset('weights', data=convert_kernel(f_tf[k]['weights'][:]))
        else:
            # otherwise copy the weights unchanged
            grp.create_dataset('weights', data=f_tf[k]['weights'][:])
        grp.create_dataset('bias', data=f_tf[k]['bias'][:])  # store the bias term

Basically, the th weights are just a copy of the tf weights, with the only exception that for conv layers the weights are converted by convert_kernel.

After this transformation, I switched the backend to Theano and loaded the converted weights. But the prediction result is incorrect: both test images were predicted to be "n02443485 black-footed ferret, ferret, Mustela nigripes".

I have to admit that I didn't spend much time on converting the weights; maybe I should be more careful. Perhaps there will be some good news when you wake up tomorrow.

BTW, I can visit the link and download the weights normally. If it doesn't work for you, I'll try to upload them elsewhere. Microsoft OneDrive could be a good choice.

Thank you~

MoyanZitto · 2016-07-22T11:20:26Z

Hi @fchollet, I have one piece of good news and one piece of bad news.
The good news is that I used model.save_weights() to re-save the pretrained weights, so now we can just use model.load_weights() to load them.

The bad news is that I used the code in https://github.com/fchollet/keras/wiki/Converting-convolution-kernels-from-Theano-to-TensorFlow-and-vice-versa , and then used save_weights like this:

from keras import backend as K
from keras.utils.np_utils import convert_kernel
import res_net50

model = res_net50.get_resnet50()
model.load_weights('tf_resnet50.h5')
# Convert every convolution kernel in place, then save the converted weights.
for layer in model.layers:
    if layer.__class__.__name__ in ['Convolution1D', 'Convolution2D']:
        original_w = K.get_value(layer.W)
        converted_w = convert_kernel(original_w)
        K.set_value(layer.W, converted_w)
model.save_weights('th_resnet50.h5')

This is the easiest solution I can figure out, but after running this script, switching the backend to Theano, and loading the 'th_resnet50.h5' file we just generated, the test result is still not correct ("n02443485 black-footed ferret, ferret, Mustela nigripes" for both test images).

Perhaps the difference between th and tf is bigger than we expected. And I think this difference could be the source of many unexpected bugs.

I've updated the links. Now you can download the weights from Google drive.

fchollet · 2016-07-22T18:51:31Z

examples/res_net50.py

@@ -0,0 +1,218 @@
+'''This script demonstrates how to build the resnet50 architecture


File should be renamed to resnet_50

fchollet · 2016-07-24T20:38:29Z

General issues with the PR:

- It's for TensorFlow, yet uses the default dim ordering (dim_ordering='th'). This is inefficient since it results in dimension shuffling back and forth with every layer.
- The syntax should be made PEP8 compliant.
- The docstring should be rewritten.
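To make the "dimension shuffling" in the first point concrete, here is a small NumPy sketch of the two layouts. It assumes the usual meaning of the two orderings ('th' is (samples, channels, rows, cols), 'tf' is (samples, rows, cols, channels)); the array and its sizes are made up for illustration and nothing here is taken from the PR's script.

```python
import numpy as np

# A hypothetical batch of 2 RGB images, 4x5 pixels, in 'th' ordering:
# (samples, channels, rows, cols).
batch_th = np.arange(2 * 3 * 4 * 5).reshape(2, 3, 4, 5)

# The same data in 'tf' ordering: (samples, rows, cols, channels).
batch_tf = batch_th.transpose(0, 2, 3, 1)

# Converting back is the inverse permutation of the axes.
roundtrip = batch_tf.transpose(0, 3, 1, 2)

print(batch_th.shape)  # (2, 3, 4, 5)
print(batch_tf.shape)  # (2, 4, 5, 3)
```

When a 'th'-ordered model runs on a backend whose native layout is 'tf', a pair of permutations like these is needed around each layer, which is the overhead the review comment refers to.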

Specifically, for the docstring:

- Fix typos.
- The mention of "go to my Github" should be removed. An author note is fine.
- If the weights are not convertible to Theano, then the mention of conversion should be removed.
- Not sure why two different download links are necessary. Would Google Drive not be accessible in China?

fchollet · 2016-07-24T20:41:00Z

Also it would be best to understand why weights are not convertible. Every operation is unit-tested to yield the same result in both Theano and TensorFlow (see backend tests), modulo the weight conversion operation. It should be impossible for a combination of identical operations to yield different results. Most likely an issue with your conversion code.

MoyanZitto · 2016-07-25T03:50:10Z

@fchollet Got it, I'll fix the problems you mentioned soon. BTW Google/Facebook/Twitter and a lot of other websites are not accessible in China because they were blocked by the hateful "Great Firewall". I know it is crazy but it did happen.

fchollet · 2016-07-25T17:45:23Z

@MoyanZitto cool, thank you. Ideally we'd have a way to host model files that isn't Google drive or Baidu. Maybe AWS S3.

MoyanZitto · 2016-07-26T11:45:20Z

@fchollet I fixed some problems, but I'm not sure whether there are still grammar mistakes in the script (really sorry for my limited English). If it is not too much trouble, feel free to modify this script as you like.

I noticed that conv layers get the default dim_ordering from K.image_dim_ordering(), so I simply use K.set_image_dim_ordering('tf') to change the dim order. Does that work?

Although we can visit AWS S3 in China, the speed is very slow... so it's better to retain the Baidu drive link.

fchollet · 2016-07-26T18:42:52Z

examples/resnet_50.py

+    return out
+
+
+def conv_block(input_tensor, nb_filter, stage, block, kernel_size=3):


No need to pass "stage" and "block". They are not used.

I'd rather see conv_block(input_tensor, kernel_size, filters, stride=2)

fchollet · 2016-07-26T19:34:39Z

> I noticed that conv layers get the default dim_ordering from K.image_dim_ordering(), so I simply use K.set_image_dim_ordering('tf') to change the dim order. Does that work?

That's not enough. Image dim ordering is hard-coded in several places of your code, such as when you do merges or when you load input data.
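As an illustration of what "hard-coded" can mean here, the sketch below uses NumPy only; the helpers `channel_axis` and `to_batch` are hypothetical and do not come from the PR. It shows two places where the ordering leaks into user code: preparing an input image, and any operation that names the channel axis explicitly (a concatenating merge is used as the example).

```python
import numpy as np


def channel_axis(dim_ordering):
    """Return the channel axis of a 4D image batch for a given ordering."""
    if dim_ordering == 'th':
        return 1  # (samples, channels, rows, cols)
    if dim_ordering == 'tf':
        return 3  # (samples, rows, cols, channels)
    raise ValueError('Unknown dim_ordering: %r' % (dim_ordering,))


def to_batch(image_hwc, dim_ordering):
    """Turn one (rows, cols, channels) image into a batch of size 1."""
    if dim_ordering == 'th':
        image = image_hwc.transpose(2, 0, 1)  # move channels to the front
    elif dim_ordering == 'tf':
        image = image_hwc  # already channels-last
    else:
        raise ValueError('Unknown dim_ordering: %r' % (dim_ordering,))
    return np.expand_dims(image, axis=0)


image = np.zeros((224, 224, 3))
for ordering in ('th', 'tf'):
    batch = to_batch(image, ordering)
    # Concatenating feature maps must use the ordering-dependent channel axis.
    merged = np.concatenate([batch, batch], axis=channel_axis(ordering))
    print(ordering, batch.shape, merged.shape)
```

Code that writes `axis=1` or `transpose(2, 0, 1)` literally works for one ordering only, so changing the global setting alone leaves those spots behaving as before.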

MoyanZitto · 2016-07-27T15:27:23Z

@fchollet Thank you very much for pointing out these mistakes! You are so kind to do so.

These (ugly) names come from Kaiming He's caffe model, see http://ethereon.github.io/netscope/#/gist/db945b393d40bfa26006
Since our .h5 weights are converted from the .caffemodel file, we have to set the layer names carefully so that the weights can be loaded into the corresponding layers. But you are right, it would be better if these names could be removed; I'll have a try tomorrow.

BTW, I don't see anything about dim_ordering in 'merge'. Could you make it clearer?

MoyanZitto · 2016-08-03T09:49:16Z

@fchollet
Sorry for not updating this script in the past few days; I was busy looking for a job.
I'm afraid I have to retain these layer names, because the weights can only be set correctly when the names of the model layers match the names of the corresponding weights. I've made the code clearer by setting a "basename"; it looks better now.
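A rough sketch of why the names matter, using plain dictionaries rather than Keras or h5py. The naming scheme below is an assumption modeled on Caffe-style names such as `res2a_branch2a`; the helpers `layer_names` and `match_weights` and the renamed layer are hypothetical, and the stored "weights" are placeholder strings.

```python
def layer_names(stage, block):
    """Build Caffe-style name prefixes, e.g. 'res2a_branch' and 'bn2a_branch'."""
    conv_name_base = 'res' + str(stage) + block + '_branch'
    bn_name_base = 'bn' + str(stage) + block + '_branch'
    return conv_name_base, bn_name_base


def match_weights(model_layer_names, saved_weights):
    """Split model layers into those with saved weights and those without."""
    matched = {name: saved_weights[name]
               for name in model_layer_names if name in saved_weights}
    missing = [name for name in model_layer_names if name not in saved_weights]
    return matched, missing


conv_base, bn_base = layer_names(stage=2, block='a')
model_layers = [conv_base + '2a', bn_base + '2a', 'my_renamed_conv']

# Placeholder values stand in for the real weight arrays.
saved = {'res2a_branch2a': 'W_conv', 'bn2a_branch2a': 'W_bn'}

matched, missing = match_weights(model_layers, saved)
print(sorted(matched))  # ['bn2a_branch2a', 'res2a_branch2a']
print(missing)          # ['my_renamed_conv']
```

Deriving the names from a shared base keeps the model code short while still producing exactly the strings that the converted weight file uses as keys.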

Also, users can set dim_ordering now. I offer both 'tf' dim_ordering weights for speed and 'th' dim_ordering weights for compatibility (if they want to use this script together with their own 'th' dim_ordering code). The links are given at the top of the code.

I think "dim_ordering" is just how the input image is organized; it should have nothing to do with the shape of the weights of conv layers. Perhaps we should cut the dependency between the input dim_ordering and the shape of conv layers. In that case a single version of the weights could be loaded into a model no matter what the image dim_ordering is.

I hope this script gets merged soon~~~ It feels really good to be a Keras contributor!

fchollet · 2016-08-03T17:33:00Z

> I think "dim_ordering" is just how the input image is organized; it should have nothing to do with the shape of the weights of conv layers. Perhaps we should cut the dependency between the input dim_ordering and the shape of conv layers. In that case a single version of the weights could be loaded into a model no matter what the image dim_ordering is.

Not quite true. Kernels have to be transposed. Also the output of the Flatten() layer will be different based on dim ordering and thus the first Dense layer after Flatten should be reshuffled.
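The Flatten point can be reproduced with NumPy alone. This is a sketch under stated assumptions: a single feature map with made-up sizes, a dense layer represented by a plain weight matrix, and row-major flattening in both orderings. It shows that the flattened vectors differ between 'th' and 'tf', and that reordering the rows of the dense weight matrix restores identical outputs.

```python
import numpy as np

channels, rows, cols, units = 3, 2, 2, 4
rng = np.random.RandomState(0)

# One feature map in 'th' ordering and the same data in 'tf' ordering.
fmap_th = rng.rand(channels, rows, cols)
fmap_tf = fmap_th.transpose(1, 2, 0)

# Dense weights trained against the flattened 'th' feature map.
dense_th = rng.rand(channels * rows * cols, units)

# Reorder the rows of the weight matrix to follow the 'tf' flattening order.
dense_tf = (dense_th.reshape(channels, rows, cols, units)
            .transpose(1, 2, 0, 3)
            .reshape(rows * cols * channels, units))

out_th = fmap_th.reshape(-1).dot(dense_th)
out_tf = fmap_tf.reshape(-1).dot(dense_tf)

# The flattened inputs are ordered differently...
flatten_differs = not np.array_equal(fmap_th.reshape(-1), fmap_tf.reshape(-1))
# ...but the reshuffled dense weights give the same output.
outputs_match = np.allclose(out_th, out_tf)

print(flatten_differs, outputs_match)
```

Without the reshuffling step, loading `dense_th` into a 'tf'-ordered model would pair each weight row with the wrong input element, even though the shapes match and loading raises no error.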

The reason why your code appears to run properly is actually that you are setting the dim ordering via K.set_image_dim_ordering, which does not reset the default dim ordering of conv layers (and other layers). I intend to fix this, by the way. What it means is that your conv layers are still using th dim ordering.

So it appears to me that your support of tf dim ordering isn't correct. For the sake of merging your PR quickly, let's give it up. Please only support th dim ordering (i.e. what you were doing initially). I'll add tf support later on myself, which will involve converting the weights and isn't quite easy.

Otherwise, the code does look better now, congrats.

fchollet · 2016-08-03T17:43:10Z

Never mind my previous post, it seems I misread your code. Let me check it out again.

fchollet · 2016-08-03T17:51:10Z

Ok, LGTM. Thanks for the valuable contribution!

add resnet50 example

903a669

fix PEP8 problems

f003171

fix PEP8 problem....again...

86bb920

This script get 10 points in PEP8 tests on my computer but.....

MoyanZitto added 3 commits July 21, 2016 16:42

add resnet 50

4684441

latest version

094d06b

fix problem caused by interrupted git push

8b8e5e0

fix PEP8 problem..again!

c7652a0

MoyanZitto added 2 commits July 22, 2016 19:45

update weights links and remove load_weights

23729b3

fix pep8!

6dcfadb

fchollet reviewed Jul 22, 2016

MoyanZitto added 2 commits July 23, 2016 19:32

remove skimage dependency, rename the file

f2bff00

fix pep8...

2acf3cd

MoyanZitto added 2 commits July 26, 2016 19:31

Merge branch 'master' of https://github.com/fchollet/keras into add_e…

cf7ee68

…xample

update

2729f0e

fchollet reviewed Jul 26, 2016

MoyanZitto added 2 commits August 3, 2016 17:06

Merge branch 'master' of https://github.com/fchollet/keras into add_e…

9aed6eb

…xample

support tf dim_ordering

9d491ce

fix PEP8 problem

5bddc8c

fchollet merged commit c725f8d into keras-team:master Aug 3, 2016

MoyanZitto deleted the add_example branch August 4, 2016 06:25
